ABSTRACT

Given that a network is only as strong as its weakest link, a key
vulnerability to network centric warfare is the threat from within.
This paper summarizes several recent MITRE efforts focused on
characterizing and automatically detecting malicious insiders within
modern information systems. Malicious insiders (MI) adversely impact
an organization’s mission through a range of actions that compromise
information confidentiality, integrity, and/or availability. Their
strong organizational knowledge, varying range of abusive behaviors,
and ability to exploit legitimate access make their detection
particularly challenging. Crucial balances must be struck while
performing MI detection: detection accuracy must be weighed against
minimizing time-to-detect, and aggregating diverse audit data must be
balanced against the need to protect the data from abuse. Key lessons
learned from our MI research include the need to understand the
context of the user’s actions, the need to establish models of normal
behavior, the need to reduce the time to detect malicious behavior,
the value of non-cyber observables, and the importance of real-world
data collections to evaluate potential solutions.


1.  The  Threat:  Malicious  Insiders 

An insider is anyone in an organization with approved access,
privilege, or knowledge of information systems, information services,
and missions. A malicious insider (MI) is one motivated to adversely
impact an organization’s mission through a range of actions that
compromise information confidentiality, integrity, and/or
availability. Analysis of the behavior of dozens of malicious insiders
(DSS 1999, Herbig and Wiskoff 2002) and our detailed analysis of six
representative cases (Maybury et al. 2004) such as CIA’s Aldrich
“Rick” Ames, FBI’s Robert Philip Hanssen (2003), and DIA’s Ana Belen
Montes (2001) illustrate the diversity of the insider threat
challenge. Each of these cases is unique in terms of the insider’s
position, motive, foreign handlers, impact, sentence, computer skill,
polygraph experience, cyber security violations, counterintelligence
activities, physical and cyber access, cyber extraction and
exfiltration, cyber communication, and the transfer of materials to
foreign handlers. The devastating impact of these three individuals
alone included the violation of confidentiality, undermining of
intelligence integrity, adverse influence on US policy, the revelation
of sources and methods, and the death and compromise of field agents.
MI motives are diverse, ranging from financial to thrill to
ideological. In each of these cases, handlers were professional
foreign service agents. Two of the three passed polygraphs. While the
computer skills of these insiders varied significantly, all left
trails of suspicious cyber activity while performing cyber access,
exfiltration, and/or communication. All engaged in counterintelligence
to evade detection and/or destroy incriminating evidence. In each case
we found opportunities to observe individual incidents and/or to
detect anomalous behavior from correlated observables.

2.  Approach  and  Research  Agenda 

Our study of MIs supports the finding that there is no single silver
bullet solution to the problem. Accordingly, we take a holistic
approach which incorporates prevention, detection, and reaction. In
this paper we focus on our efforts aimed at automated detection. Our
analysis has led us to explore several fundamental hypotheses
including:

1. While some MIs can be detected using a single cyber observable,
other MIs can be detected only by using multiple and heterogeneous
observables.

2. Fusing information from heterogeneous information sources (e.g.,
logs from printers, authentication, card readers, telephone calls) and
various levels of the IP stack (e.g., application vs. network traffic)
allows more accurate and timely indications and warning of malicious
insiders. Even a single sensor yields higher detection rates when it
monitors a broad range of activities.

3. Observables together with domain knowledge (e.g., user role, asset
value to mission) can help detect inappropriate behavior (e.g.,
need-to-know violations).

Our basic approach is consistent with an overall strategy that aims to
prevent, detect, and react to insider threats while balancing privacy
and security. Our research methodology includes conducting studies
under the auspices of an independent review board (IRB) together with
measures of anonymization and aggregation to ensure the protection of
privacy.

The remainder of this paper first summarizes our experience in an
insider threat challenge workshop to assess the ability of several
distinct sensor approaches to detect three simulated insiders on a
live network. We then describe an initiative to develop a broad set of
context-sensitive rules and fuse individual indicators into overall
threat scores to highlight potentially abusive behavior. Input is
based on passive, network-based sensors that monitor how users
interact with information, taking advantage of models of the context
of users and information. We conclude by identifying lessons learned
from our investigations as well as future research directions.

4.  Insider  Threat  Challenge  Workshop 

To enhance understanding of and accelerate solutions for the insider
threat, a collaborative, six-month challenge workshop was held to
characterize and create analysis methods to counter sophisticated
malicious insiders (Maybury et al. 2004). Following a careful study of
past and projected cases, several prototype techniques were developed
to provide early warning of insider activity, including novel
algorithms for structured analysis and data fusion. The algorithms
were assessed in an operational network against three distinct classes
of human insiders (an analyst, an application administrator, and a
system administrator), measuring timeliness and accuracy of detection,
which we subsequently describe.

4.1 Simulated MIs: Pal, Jill, and Jack

Grounding our efforts in realistic insider behavior, we explored
detecting three types of insiders in detail in this activity. The
first was a historical insider modeled as a prototype of past
need-to-know violators. We call this insider Pal. A second insider,
named Jack, was a projected insider who would aim to disrupt, damage,
or destroy the network or elements thereof. In the course of defining
and simulating these insiders, the scenario team implemented a third
category of insider, an application administrator, called News Admin
or Jill. Only Pal’s behavior model was disclosed to sensor builders
prior to the experiment. For detail about these insiders, including a
log of specific actions taken by the insiders, see Maybury et al.
(2004). The three malicious insider cases were simulated on MITRE’s
Demilitarized Zone (DMZ) network. The DMZ consists of over 300 hosts
with a range of missions utilizing services such as web (HTTP), news
(NNTP), file transfer (FTP), messaging (SMTP), mail (POP, IMAP),
database (SQL), and question answering. We instrumented 18 of 31 nodes
on the NRRC (Northeast Regional Research Center) subnetwork, which had
75 on-line, active users during the evaluation.

A semi-automated process captured, filtered, and anonymized the
malicious insider collection to address security and privacy concerns.
Figure 1 illustrates the heterogeneous nature of the collection, which
consists of over 11 million records and spans physical sensors (e.g.,
employee badge readers), network level sensors (e.g., Snort rules
modified to detect inappropriate connections or behavior), host
sensors (to detect user access and command sequences), and
applications (e.g., mail server logs, web server logs, network news
logs). A Common Data Repository (CDR) was established as a central
database storing the over 11 million anonymized, time-stamped
audit-log records collected over three months.
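
The paper does not describe how records were anonymized. As one illustrative possibility only, the sketch below maps real user names to stable pseudonyms with a keyed hash, so that records from different sensors can still be correlated per user. The key, the record layout, and the function names are assumptions made for this example.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be held by a privacy custodian.
SECRET_KEY = b"replace-with-a-secret-key"


def pseudonymize(username: str, key: bytes = SECRET_KEY) -> str:
    """Map a real user name to a stable pseudonym such as 'user48213'."""
    digest = hmac.new(key, username.encode("utf-8"), hashlib.sha256).digest()
    # Reduce the keyed hash to a short number (illustrative only; a real
    # system would also have to handle collisions between users).
    number = int.from_bytes(digest[:4], "big") % 100000
    return f"user{number}"


def anonymize_record(record: dict, key: bytes = SECRET_KEY) -> dict:
    """Return a copy of an audit record with its 'user' field pseudonymized."""
    cleaned = dict(record)
    cleaned["user"] = pseudonymize(record["user"], key)
    return cleaned


badge = anonymize_record({"source": "badge", "user": "jdoe"})
mail = anonymize_record({"source": "mail", "user": "jdoe"})
print(badge["user"] == mail["user"])  # same person, same pseudonym
```

Because the mapping is deterministic for a given key, a badge record and a mail record for the same person remain linkable after anonymization, which is what later fusion steps require.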


Figure 1. Heterogeneous and Multilevel Data Sources (levels shown:
application, host, network, physical)


4.2  Event  and  Observable  Taxonomy 

In order to access, exploit, or damage assets, an MI will necessarily
need to perform (or have another person or process perform) a series
of actions to gain privileges and to access or manipulate assets.
Derived from our analysis of MI cases, Figure 2 shows a taxonomy of
cyber events which have associated observables that hold promise as
the foundation of a detection system. The taxonomy distinguishes
observables in the cyber domain from those in the physical domain. The
taxonomy includes observables such as results of the polygraph,
records of security violations, missing or misleading reports on
finances, foreign travel or foreign contacts, physical facility
access, personal finances, materials transfer, counterintelligence,
social behavior, and communications.

In this research we focused exclusively on cyber observables,
including other observables that could be readily converted to a cyber
signal (e.g., digitized facility access logs).


Figure  2.  Cyber  Event/Observable  Taxonomy 

The core of the taxonomy covers cyber observables grouped into the
classes of cyber action indicated in bold italics in Figure 2. These
include activities of network, system, and information reconnaissance,
access to assets (e.g., media, hosts, accounts), entrenchment (e.g.,
installing sensors or unauthorized software), exploitation (e.g.,
commanding and controlling entrenched assets such as software bots or
zombie machines), extraction and exfiltration (e.g., of hardcopy,
media, information), communication (e.g., encrypted messaging, encoded
messages, covert channels), manipulation of cyber assets (e.g.,
changing file permissions, suppressing or altering information
content), counterintelligence (e.g., wiping disks), and other cyber
activities associated with unethical or addictive behavior (e.g.,
online gambling). Some observables have served as a tip-off of
malicious activity in historical cases; others serve as direct
indicators of inappropriate behavior.


COMMON DATA

- Authentication, Mail, DMZ Servers, IDS, Honeynet, Badge Data

- Application Logs (e.g., web, DB, mail)

- Nessus Scans (vulnerability analysis)

- Switch logs, StealthWatch logs


Figure 3. Integrated Architecture for Insider Detection System

4.3 Insider Detection

While the live network instrumentation described in Section 4.1
provided an unprecedented and essential set of MI experimental data,
the thrust of our activity was developing novel algorithms to detect
MIs. Figure 3 illustrates the high level architecture of a proof of
concept system that was designed, implemented, and tested to detect
MIs. Distributed, heterogeneous sensors provide input to a Common Data
Repository (CDR) from which a range of analyses are performed,
including data fusion and structural analysis, to identify potential
suspects on a watch list or issue an alert of an insider threat. As
illustrated in Figure 3, our technical approach is novel in the
following respects:

• A Common Data Repository (CDR) captures and anonymizes heterogeneous
sensor input.

• Multilevel monitoring occurs at the packet level, system level, and
application level.

• StealthWatch sensors detect abnormal insider behavior on the network
such as scanning, file transfer, or internal network connections.

• Distributed honeynets acquire attacker properties, pre-attack
intentions, and potential attack strategies.

• A real-time, top-down structural analysis drawing upon functional
models of MIs maps pre-attack indicators to models of potential MIs.

• Traditional and non-traditional indicators (e.g., logs of network
activity, physical access, PBX, help desks), including non-digital
sources, are fused bottom-up.


Sensor inputs are then exploited by a decision analysis component to
determine watch list membership and insider detection. We next
consider each of the primary detection strategies.


4.4  HoneyTokens 

Honeypots are realistic but dummy systems that reflect true production
systems and are designed to attract malicious users to inappropriately
access resources. Combined with subtly advertised enticements to
potential insider threats, honeypots provide a mechanism to determine
what motivates the inside attacker and what capabilities the attacker
possesses.

A novel idea developed during the workshop and applied in the insider
detection process is the notion of a honeytoken. A honeytoken is a
semi-valuable piece of information whose use can be readily tracked.
This could be a credit card number, an Excel spreadsheet, a database
entry, or a login and password. A honeytoken is an entity that has no
authorized use. Honeytokens can be used for the initial detection of
insider threats; those threats can then be redirected to honeynets to
confirm whether a violation has occurred, potentially learning more
about the threat.

In the Pal scenario described in Section 4.1, the honeytoken takes the
form of a web page which lists (fictitious) operatives in the
geographic region of interest to the MI. The data fusion group,
detailed in a subsequent section, exploits the detection of honeytoken
access as one of a range of indicators of malicious behavior. In other
examples of honeytokens (e.g., login password) it is possible to use
the false information to track activities (e.g., in a controlled
account) to more readily discover MI actions, capabilities, and
intentions.
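
Because a honeytoken has no authorized use, detecting access to it can be very simple. The following is a minimal sketch, assuming a hypothetical decoy path and a simplified four-field log format; neither is taken from the workshop system.

```python
# Hypothetical decoy paths. Any request for them is suspicious by
# construction, because a honeytoken has no authorized use.
HONEYTOKEN_PATHS = {"/regional/operatives.html"}


def honeytoken_hits(log_lines):
    """Yield (user, path) for each request that touches a honeytoken.

    Each line is assumed to look like '<timestamp> <user> <method> <path>'.
    """
    for line in log_lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines
        _timestamp, user, _method, path = parts
        if path in HONEYTOKEN_PATHS:
            yield user, path


sample_log = [
    "2003-12-10T11:14:38 user324 GET /index.html",
    "2003-12-10T11:15:02 user295 GET /regional/operatives.html",
]
hits = list(honeytoken_hits(sample_log))
print(hits)  # [('user295', '/regional/operatives.html')]
```

In a fusion setting such a hit would be emitted as one indicator among several rather than treated as proof of malicious intent.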

4.5 StealthWatch

One of the sensor types deployed in support of the insider threat
workshop was StealthWatch. StealthWatch (http://www.lancope.com)
provides traffic profiling (e.g., data flow analysis) and host
profiling such as zone profiling (e.g., monitoring which hosts can
talk to which). StealthWatch creates a concern index (CI) based on
flow analysis, reconnaissance activity, and anomaly detection. Any
host reaching an accumulated CI level above 20,000 generates an alarm.
StealthWatch was deployed to monitor scanning, downloads, and inside
connections. StealthWatch enabled an analyst to identify Jack within a
day of his malicious activity, which began on February 11th. However,
StealthWatch did not detect Pal and Jill, who exhibited no scanning,
download, or inside connection anomalies.
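
The accumulate-then-threshold behavior described above can be sketched as follows. Only the 20,000 alarm level comes from the text; the event types and their point values are invented for illustration and do not reflect how StealthWatch actually computes its concern index.

```python
from collections import defaultdict

ALARM_THRESHOLD = 20_000  # alarm level reported in the text

# Hypothetical point values per event type (not taken from StealthWatch).
EVENT_POINTS = {
    "port_scan": 8_000,
    "large_download": 5_000,
    "new_internal_connection": 1_500,
}


def hosts_in_alarm(events):
    """Accumulate points per host; return hosts whose total exceeds the threshold."""
    concern_index = defaultdict(int)
    for host, event_type in events:
        concern_index[host] += EVENT_POINTS.get(event_type, 0)
    return {host: ci for host, ci in concern_index.items() if ci > ALARM_THRESHOLD}


events = [
    ("10.0.0.5", "port_scan"),
    ("10.0.0.5", "port_scan"),
    ("10.0.0.5", "large_download"),
    ("10.0.0.7", "new_internal_connection"),
]
print(hosts_in_alarm(events))  # {'10.0.0.5': 21000}
```

A host that never scans, downloads heavily, or opens unusual connections accumulates nothing under such a scheme, which is consistent with Pal and Jill going undetected by this sensor.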

4.6  Structured  Analysis 

Using the cyber observable framework introduced earlier in Figure 2,
the structured analysis approach considered a range of cyber
observables associated with adversaries to perform top-down,
real-time, model-based detection of MIs. Knowledge of network mission
and network configuration together with likely adversarial actions
were used to filter and highlight adversary behavior. MI actions
modeled included reconnaissance (e.g., via web browsing or net scans),
access (e.g., privilege escalation), entrenchment (e.g., via
deployment of sensors), extraction and exfiltration (e.g.,
unauthorized downloads), and communications (e.g., coded messages or
covert channels). A range of sensors and data logs are relevant to
detecting many of the cyber actions. For example, email communication
patterns would provide insight into the social network of a malicious
insider. Similarly, large downloads might signal data exfiltration.

The structural analysis group (SAG) modeled two insiders, Pal and
Jack, considering temporal characteristics of protocols such as event
proximity (e.g., immediate vs. days vs. years) and observable
ordering. The Pal detector exhibited 3% false positives and no false
negatives, and the Jack detector had 1% false positives and 50% false
negatives.
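
The SAG models themselves are not given here. As a rough illustration of how observable ordering and event proximity might be checked together, the sketch below tests whether a sequence of observables occurs in a required order with each step close enough in time to the previous one. The function name, the step names, and the two-day window are assumptions for this example.

```python
from datetime import datetime, timedelta


def matches_model(events, ordered_steps, max_gap):
    """Return True if the steps occur in order, each within max_gap of the prior step.

    events: (timestamp, observable) pairs sorted by timestamp.
    latest[k] holds the most recent time at which steps 0..k were completed.
    """
    latest = [None] * len(ordered_steps)
    for timestamp, observable in events:
        # Walk the steps backwards so one event never fills two steps.
        for k in range(len(ordered_steps) - 1, -1, -1):
            if observable != ordered_steps[k]:
                continue
            if k == 0:
                latest[0] = timestamp
            elif latest[k - 1] is not None and timestamp - latest[k - 1] <= max_gap:
                latest[k] = timestamp
    return latest[-1] is not None


steps = ["reconnaissance", "access", "exfiltration"]
observed = [
    (datetime(2003, 12, 9, 10, 0), "reconnaissance"),
    (datetime(2003, 12, 10, 11, 14), "access"),
    (datetime(2003, 12, 11, 23, 0), "exfiltration"),
]
print(matches_model(observed, steps, timedelta(days=2)))  # True
print(matches_model(observed, steps, timedelta(days=1)))  # False
```

Tightening the window trades false positives for false negatives, which is one way a detector could end up with the kind of asymmetric error rates reported above.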

The structured analysis approach detected all three simulated MIs,
Pal, Jill, and Jack, at various times. Pal was put on the watch list
on December 11th, 2003, two days after Pal initiated behavior. Six
days later an alert was issued for Pal. The News Administrator Jill
was put on the watch list on January 24th, 2004, even before her
suspicious behavior started on February 11th. Four days later, on
February 15th, SAG issued an alert for Jill. Finally, SAG issued an
alert on February 17th for Jack, six days after he initiated malicious
administrator behavior.
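
The dates in this timeline can be turned into elapsed days directly. In the sketch below, Pal's start and alert dates are derived from the text ("two days after" and "six days later"), and Jack's start date is taken from the StealthWatch discussion; no watch-list date is reported for Jack under structured analysis.

```python
from datetime import date

# Dates read from the text; Pal's start (Dec 9) and alert (Dec 17) are derived.
TIMELINE = {
    "Pal": {"start": date(2003, 12, 9), "watch": date(2003, 12, 11), "alert": date(2003, 12, 17)},
    "Jill": {"start": date(2004, 2, 11), "watch": date(2004, 1, 24), "alert": date(2004, 2, 15)},
    "Jack": {"start": date(2004, 2, 11), "watch": None, "alert": date(2004, 2, 17)},
}


def days_from_start(entry, event):
    """Days from the start of malicious behavior to an event (negative = earlier)."""
    when = entry[event]
    return None if when is None else (when - entry["start"]).days


for name, entry in TIMELINE.items():
    print(name, "watch:", days_from_start(entry, "watch"),
          "alert:", days_from_start(entry, "alert"))
# Pal watch: 2 alert: 8
# Jill watch: -18 alert: 4
# Jack watch: None alert: 6
```

The negative value for Jill reflects that she was placed on the watch list before her suspicious behavior began.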

4.7  Data  Fusion 

In  contrast  to  the  preceding  model-based,  top-down  analysis 
of  observables,  a  data  fusion  approach  explored  bottom-up 
combination  of  observables  from  multiple  sensors  including 
the  combination,  correlation  and  analysis  of  both  traditional 
and  novel  indicators  stored  in  the  Common  Data  Repository. 
This  low-level  data  came  from  available  resources  on  the 
MITRE  DMZ  network  along  with  inputs  from  other  teams 
(StealthWatch  and  honeynets).  Thus  data  fusion  occurred 
over  input  from  the  physical,  network,  host  and  application 
levels.  Fusion  can  occur  along  a  number  of  dimensions  such 
as  the  type  of  sensor  (e.g.,  card  reader,  authentication,  printer, 
telephone  calls)  and  the  level  of  IP  stack  (e.g.,  from  network 
to  application). 

A proof-of-concept data fusion engine was developed that:

• Fused multiple indicators that were related by IP, user name, etc.
into a single indicator

• Allowed customer-selectable weights for each element in the
observable taxonomy

• Allowed site-specific criteria for generating an alert and for
submission of names to a watch list based on:

o Accumulated indicator weight
o Accumulated indicator breadth, i.e., the number of unique sources of
evidence
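
The criteria listed above can be sketched in a few lines. The actual engine was written in the JESS rule system; the Python below is only an illustration, and the thresholds, the source labels, and the rule that an alert needs both weight and breadth are assumptions. The weights for user295, user9, and user324 are the ones shown in Figure 4.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    user: str
    category: str  # element of the observable taxonomy, e.g. "Cyber-Access"
    source: str    # sensor or log that produced the evidence (hypothetical labels)
    weight: int


def fuse(indicators, watch_weight, alert_weight, alert_breadth):
    """Group indicators by user and apply site-specific watch and alert criteria."""
    total_weight = defaultdict(int)
    sources = defaultdict(set)
    for ind in indicators:
        total_weight[ind.user] += ind.weight
        sources[ind.user].add(ind.source)
    decisions = {}
    for user, weight in total_weight.items():
        breadth = len(sources[user])  # number of unique sources of evidence
        if weight >= alert_weight and breadth >= alert_breadth:
            decisions[user] = "alert"
        elif weight >= watch_weight:
            decisions[user] = "watch"
    return decisions


indicators = [
    Indicator("user324", "Cyber-Access", "auth", 1),
    Indicator("user295", "Physical-Access", "badge", 5),
    Indicator("user295", "Cyber-Reconnaissance", "web", 10),
    Indicator("user9", "Cyber-Communication", "mail", 15),
    Indicator("user295", "Cyber-Extraction-Exfiltration", "mail", 8),
]
print(fuse(indicators, watch_weight=10, alert_weight=20, alert_breadth=3))
# {'user295': 'alert', 'user9': 'watch'}
```

Requiring breadth as well as weight means a single noisy sensor cannot raise an alert on its own, at the cost of missing insiders who are visible to only one sensor.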

In addition, a novel sensor was developed that performed email
consistency checking. The email sensor analyzed PGP signatures and
PGP-encrypted attachments and generated error messages to report if
the PGP data did not decode properly. A simple, flexible data fusion
engine was implemented in the JESS rule system.


Figure 4 illustrates a series of logged messages from the operation of
the data fusion engine that report various inferences being made on
the basis of sensor input for the scenarios on the DMZ network
introduced in Section 4.1.

The data fusion engine found indications of malicious activity in two
of the three scenarios with little tuning. It also uncovered one
in-the-wild incident: a user who was excessively downloading news.
Perhaps equally significant is the substantial data reduction that
resulted. The data fusion engine needed to examine 7.4 million records
to analyze the activity of the 75 users on the system, and it
generated 259 indicators for 24 of those users. The data fusion
approach exhibited very low false positive rates (0 or 3%) with
moderate to high false negative rates (33 to 66%) using a threshold of
the maximum change between any two detected users.


Cyber-Access, user324, weight 1, at 2003-12-10 11:14:38, from news.mitre.org
su to user9676 failed for non-admin user user324 on /dev/pts/0

Physical-Access, user295, weight 5, at 2003-12-15 19:19:37,
After hours badge access for user295

Cyber-Extraction-Exfiltration, user2649, weight 5, at 2004-01-06 15:37:28, from nrrc-springfield.mitre.org,
Data was uploaded to an external server via FTP protocol

Cyber-Reconnaissance, user295, weight 10, at 2004-01-09 20:57:18, from nrrc-springfield.mitre.org,
User user295 searching in non-need-to-know country korea

Cyber-Communication, user9, weight 15, at 2004-02-10 22:14:48, from cvw.mitre.org,
User user9 received email with masqueraded content from user11649@yahoo.com

Cyber-Reconnaissance, user1, weight 5, at 2004-02-10 13:54:15, from nrrc-plymouth.mitre.org,
Ongoing CI violation -- 066.170.227.074 currently has 49613 alerts of this type...

Cyber-Extraction-Exfiltration, user295, weight 8, at 2004-02-12 23:54:58, from dmzsrv1.mitre.org,
User user295 sent encrypted email to user9983@comcast.net

Cyber-Extraction-Exfiltration, user1, weight 15, at 2004-02-20 12:25:03, from nrrc-erie.mitre.org,
user1 sent email with masqueraded content to user1@mitre.org

Figure 4. Data Fusion Log

4.8 Evaluation: Accuracy and Speed

We designed evaluation methods to assess both the accuracy and
timeliness of insider threat detection. Accuracy was measured in terms
of false positives and false negatives, defined as follows:

• False positives = # users incorrectly detected as malicious insiders

• False negatives = total # actual malicious insiders - # correctly
detected

For each sensor, the maximum change between any two detected users is
computed, and then false positive and false negative measures are
computed at several points along the graph. Detection results were
measured at three weighted thresholds: maximum change, 50% of the
highest factor, and 10% of the highest factor.
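
The text does not spell out how the maximum-change threshold was computed. One plausible reading, shown below as an assumption, is to rank users by accumulated score and cut at the largest drop between adjacent scores; the false positive and false negative counts then follow the definitions given above. All user names and scores in the example are invented.

```python
def detect_by_max_change(scores):
    """Flag the users ranked above the largest drop between adjacent scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) < 2:
        return {user for user, _ in ranked}
    drops = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    cut = drops.index(max(drops))  # position of the last user above the gap
    return {user for user, _ in ranked[: cut + 1]}


def false_positives(detected, actual):
    """Number of users incorrectly detected as malicious insiders."""
    return len(detected - actual)


def false_negatives(detected, actual):
    """Total actual malicious insiders minus those correctly detected."""
    return len(actual) - len(detected & actual)


# Invented scores and ground truth, for illustration only.
scores = {"u1": 23, "u2": 20, "u3": 15, "u4": 5, "u5": 1}
actual_insiders = {"u1", "u2", "u6"}

detected = detect_by_max_change(scores)
print(sorted(detected))                            # ['u1', 'u2', 'u3']
print(false_positives(detected, actual_insiders))  # 1
print(false_negatives(detected, actual_insiders))  # 1
```

The 50% and 10% thresholds mentioned above would replace the gap-based cut with a fixed fraction of the highest score, flagging more users and shifting errors from false negatives toward false positives.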


In contrast to accuracy, timeliness was measured as the elapsed time
from the start of an insider’s malicious activity to two events: the
time at which the insider is put on a watch list, and the time at
which an insider threat alarm occurs. We consider these measures
stricter than would be found in an operational environment, which
would include a human analyst in the loop to make alert decisions
(StealthWatch results did include a human analyst, whereas the other
methods were autonomous).

Figure 5. Timeliness of Detection Results


Figure 5 summarizes the timeliness performance of the StealthWatch,
structured analysis, and data fusion detectors for the three insiders:
Pal, Jill, and Jack. One objective was to reduce the time from
defection to the time of detection from years to months to weeks to
days, if not minutes. In Figure 5, the eye icon indicates the day when
the sensor put the insider on the watch list. The bell indicates the
first day when an alert is issued. The black vectors indicate the
start (left end of the vector) and stop (right of the arrowhead) times
of the malicious behavior by the three human MIs. The bold font dates
are associated with the StealthWatch sensor (which only detected
Jack), the underlined dates are for the structured analysis method,
and the italicized ones are for the data fusion performance. In
summary, when reviewed across all sensors and methods, of our three
MIs, two were detected within one week of their initiation of
suspicious activity and the third was detected within two weeks.
Because a multiplicity of sensors provides evidence for inferences,
Jack would still be detected even if some of the more obvious
indicators, such as his scanning behavior, were removed.


5. Context Sensitive Malicious Insider Detection

One goal is to help experts identify insider threats as quickly as
possible, ideally before they can do damage. Developing profiles of a
typical employee’s online behaviors (how they search, how they access
networks, how they print documents) and “adding context” to those
actions can help pinpoint behaviors that fall outside normal
boundaries. Adding context involves human effort along with the
electronic detective work. This includes understanding users’ roles,
normal relationships with others, and normal information usage
patterns.

One of the key challenges illustrated in the insider threat challenge
workshop is the fact that inappropriate behavior for one user could be
considered appropriate for another. In a related initiative, we have
developed a broad set of context-sensitive rules and have fused
individual indicators into overall threat scores to highlight
potentially abusive behavior. Input is based on passive, network-based
sensors that monitor how users interact with information, taking
advantage of models of the context of users and information.

Analysts must dig deeper to find out whether the insider has a
legitimate reason to access the materials in question by considering
the staff member’s role, the people with whom they normally interact,
and the information they are trying to obtain. For example, if a
computer technician starts searching on terms that an analyst would
use, this might raise a red flag. Accordingly, we have been developing
and testing information-use sensors and user attribution techniques,
along with context-sensitive rules that help analysts determine when
insiders transfer files in unusual ways or suddenly change their
information-seeking behavior.
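
The technician example can be expressed as a simple role-aware rule. The sketch below is illustrative only: the role profiles and search terms are invented, and the actual rules developed in this work are not described at this level of detail.

```python
# Hypothetical role profiles: topics each role is expected to search for.
ROLE_TOPICS = {
    "analyst": {"korea", "proliferation", "sanctions"},
    "technician": {"driver", "firmware", "printer"},
}


def out_of_role_terms(role, search_terms):
    """Return search terms that fit another role's profile but not this role's."""
    own = ROLE_TOPICS.get(role, set())
    others = set().union(
        *(topics for other, topics in ROLE_TOPICS.items() if other != role)
    )
    return sorted(t for t in set(search_terms) if t in others and t not in own)


print(out_of_role_terms("technician", ["firmware", "korea", "sanctions"]))
# ['korea', 'sanctions']
print(out_of_role_terms("analyst", ["korea", "sanctions"]))
# []
```

The same searches produce an indicator for the technician and none for the analyst, which is the sense in which the rule is context sensitive.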

Accordingly, some of the key elements of the technical approach
include:

• Monitor how users interact with data by sensing and translating the
network protocols tied to information use.

• Establish methods to attribute events to users (vs. IP addresses).

• Deploy software agents to collect user and information context.

• Develop a broad set of context-sensitive rules to highlight
potentially abusive behavior.

• Combine indicators into a scoring system to prioritize threats.
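
The scoring algorithm is not specified in this paper. As one simple stand-in for the last element above, the sketch below combines per-indicator probabilities with a noisy-OR rule and ranks users by the result. The choice of noisy-OR, the probabilities, and the user names are all assumptions for illustration.

```python
def threat_score(indicator_probs):
    """Noisy-OR: probability that at least one indicator reflects malicious intent."""
    p_benign = 1.0
    for p in indicator_probs:
        p_benign *= 1.0 - p
    return 1.0 - p_benign


def rank_users(alerts):
    """Return (user, score) pairs sorted from most to least suspicious."""
    scores = {user: threat_score(probs) for user, probs in alerts.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented per-indicator probabilities for three hypothetical users.
alerts = {"user_a": [0.5, 0.5], "user_b": [0.6], "user_c": []}
for user, score in rank_users(alerts):
    print(user, round(score, 2))
# user_a 0.75
# user_b 0.6
# user_c 0.0
```

Noisy-OR treats indicators as independent, so several moderate indicators can outrank one strong indicator; a deployed system would need to validate that assumption against real data.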

For a given user, sensors issue alerts, which feed an algorithm that
estimates the probability that the user is malicious. The analyst is
given a threat score, a ranking of all the employees in the
organization based on this probability. Our accomplishments to date
include:

• The development and successful testing of our information-use
sensors, user attribution techniques, and data anonymization routines.

• The collection of a large, realistic, real-world background data set
representing over 300 days of activity and over 3000 users.

• The crafting and execution of 8 malicious insider scenarios and the
tools to integrate them into the background data set for testing
purposes.

• The development and testing of context-sensitive rules that detect
printing anomalies, reconnaissance, search and acquisition of distant
or unusual information, unusual file movement, changes in information
seeking behavior, and evasive information seeking.

Future  plans  include  the  refinement  of  a  user  interface  and 
analysis  tool  to  help  analysts  further  refine  their  searches  for 
malicious  actors. 

6.  Summary 

Malicious insiders pose perhaps the most serious threat to
organizational cyber assets. Malicious insider behavior is distinct
from that of classical external intruders and cannot be detected using
traditional intrusion detection methods. In this article, we report
results from a challenge workshop that demonstrated how an integration
of multiple approaches promises early and effective warning and
detection for a range of insider threats. We also report our efforts
to create context-sensitive malicious insider detection.

However, while this research makes initial contributions to the
malicious insider problem, it also raises many new research
directions. These include the need for more refined malicious insider
models, more elaborate cyber action/observable taxonomies, more
comprehensive test corpora, and more sophisticated detection
algorithms. Effective counter-MI programs should encompass protection,
detection, and reaction elements and must address challenges of data
fusion, sensor accuracy, real-time detection, and privacy.